Abstract. We describe soft versions of the global cardinality constraint 
and the regular constraint, with efficient filtering algorithms maintaining 
domain consistency. For both constraints, the softening is achieved by 
augmenting the underlying graph. The softened constraints can be used 
to extend the meta-constraint framework for over-constrained problems 
proposed by Petit, Régin and Bessière. 



1 Introduction 

Constraint Programming (CP) is a widely used and efficient technique to solve 
combinatorial optimization problems. However in practice many problems are 
over-constrained (intrinsically or from being badly stated). Several frameworks 
have been proposed to handle over-constrained problems, mostly by introducing 
soft constraints that are allowed to be (partially) violated. The most well-known 
framework is the Partial Constraint Satisfaction Problem framework (PCSP [H]), 
which includes the Max-CSP framework that tries to maximize the number of 
satisfied constraints. Since in this framework all constraints are either violated 
or satisfied, this objective is equivalent to minimizing the number of violations. 
It has been extended to the Weighted-CSP, associating a degree of violation 
(not just a boolean value) to each constraint and minimizing the sum 
of all weighted violations. The Possibilistic-CSP ^H] associates a preference to 
each constraint (a real value between 0 and 1) representing its importance. The 
objective of the framework is the hierarchical satisfaction of the most important 
constraints, that is, the minimization of the highest preference level for a violated 
constraint. The Fuzzy-CSP j6l7| is somewhat similar to the Possibilistic-CSP 
but here a preference is associated to each tuple of each constraint. A preference 
value of 0 means the constraint is highly violated and 1 stands for satisfaction. 
The objective is the maximization of the smallest preference value induced by 
a variable assignment. The last two frameworks are different from the previous 
ones since the aggregation operator is a min/max function instead of addition. 
Max-CSPs are typically encoded and solved with one of two generic paradigms: 
valued-CSPs [19] and semi-rings [5]. 



Another approach to model and solve over-constrained problems involves 
Meta-Constraints [13]. The idea behind this technique is to introduce a set 
of domain variables Z that capture the violation cost of each soft constraint. 
By correctly constraining these variables it is possible to replicate the previous 
frameworks and even to extend the modeling capability to capture other types 
of violation measures. Namely the authors argue that although the Max-CSP 
family of frameworks is quite efficient to capture local violation measures it is 
not as adequate to model violation costs involving several soft constraints si- 
multaneously. By defining (possibly global) constraints on Z such a behaviour 
can be easily achieved. The authors propose to replace each soft constraint $S_i$ 
present in a model by a disjunctive constraint specifying that either $z_i = 0$ and 
the constraint $S_i$ is hard, or $z_i > 0$ and $S_i$ is violated. This technique allows the 
resolution of over-constrained problems within traditional CP solvers. 

Comparatively few efforts have been invested in developing soft versions of 
common global constraints |14I4I9| . Global constraints are often key elements in 
successfully modeling real applications and being able to easily and effectively 
soften such constraints would yield a significant improvement in flexibility. In 
this paper we study two global constraints: the widely known global cardinality 
constraint (gcc) [15] and the new regular [12] constraint. For each of these we 
propose new violation measures and provide the corresponding filtering algo- 
rithms to achieve domain consistency. All the constraint softening is achieved by 
enriching the underlying graph representation with additional arcs that represent 
possible relaxations of the constraint. Violation costs are then associated to these 
new arcs and known graph algorithms are used to achieve domain consistency. 

The two constraints studied in this paper are useful to model and solve 
personnel rostering problems (PRP). The PRP objective is typically to distribute 
a set of working shifts (or days off) to a set of employees every day over a 
planning horizon (a set of days). The gcc is a perfect tool to restrict the number 
of work shifts of each type (Day, Evening, and Night for instance) performed by 
each employee. Other types of constraints involve sequences of shifts over time, 
typically forbidding non ergonomic schedules. The regular constraint has the 
expressive power necessary to cope with the complex regulations found in many 
organizations. Since most real rostering applications are over-constrained (due 
to lack of personnel or over-optimistic scheduling objectives), soft versions of 
the gcc and regular constraints promise to significantly improve our modelling 
flexibility. 

This paper is organized as follows. Section 2 presents background information 
on Constraint Programming and the softening of (global) constraints. In 
Sections 3 and 4 we describe the softening of the gcc and the regular constraint 
respectively. Both constraints are softened with respect to two violation 
measures. We also provide corresponding filtering algorithms achieving domain 
consistency. Section 5 discusses the aggregation of several soft (global) constraints 
by meta-constraints. Finally, a conclusion is given in Section 6. 



2 Background 



We assume familiarity with the basic concepts of constraint programming. For 
a thorough explanation of constraint programming, see 

A constraint satisfaction problem (CSP) consists of a finite set of variables 
$X = \{x_1, \ldots, x_n\}$ with finite domains $\mathcal{D} = \{D_1, \ldots, D_n\}$ such that $x_i \in D_i$ 
for all $i$, together with a finite set of constraints $\mathcal{C}$, each on a subset of $X$. A 
constraint $C \in \mathcal{C}$ is defined as a subset of the Cartesian product of the domains 
of the variables that are in $C$. A tuple $(d_1, \ldots, d_n) \in D_1 \times \cdots \times D_n$ is a solution 
to a CSP if for every constraint $C \in \mathcal{C}$ on the variables $x_{i_1}, \ldots, x_{i_k}$, we have 
$(d_{i_1}, \ldots, d_{i_k}) \in C$. A constraint optimization problem (COP) is a CSP together 
with an objective function to be optimized. A solution to a COP is a solution 
to the corresponding CSP that has an optimal objective function value. 

Definition 1 (Domain consistency). A constraint $C$ on the variables $x_1, \ldots, x_k$ 
is called domain consistent if for each variable $x_i$ and value $d_i \in D_i$, there 
exist values $d_1, \ldots, d_{i-1}, d_{i+1}, \ldots, d_k$ in $D_1, \ldots, D_{i-1}, D_{i+1}, \ldots, D_k$ such that 
$(d_1, \ldots, d_k) \in C$. 

Our definition of domain consistency corresponds to hyper-arc consistency or 
generalized arc consistency, which are also often used in the literature. 

Definition 2 (Consistent CSP). A CSP is domain consistent if all its con- 
straints are domain consistent. A CSP is inconsistent if it has no solution. Sim- 
ilarly for a COP. 

When a CSP is inconsistent it is also said to be over-constrained. It is then 
natural to identify soft constraints, that are allowed to be violated, and minimize 
the total violation according to some criteria. For each soft constraint C, we 
introduce a function that measures the violation, and has the following form: 

$\text{violation}_C : D_1 \times \cdots \times D_n \to \mathbb{N}$. 

This approach has been introduced in |14, and was developed further in 0]. 
There may be several natural ways to evaluate the degree to which a global 
constraint is violated and these are not equivalent usually. A standard measure 
is the variable-based cost: 

Definition 3 (Variable-based cost). Given a constraint $C$ on the variables 
$x_1, \ldots, x_k$ and an instantiation $d_1, \ldots, d_k$ with $d_i \in D_i$, the variable-based cost 
of violation of C is the minimum number of variables that need to change their 
value in order to satisfy the constraint. 

Alternative measures exist for specific constraints. For example, if a constraint 
is expressible as a conjunction of binary constraints, the cost may be defined as 
the number of these binary constraints that are violated. For the soft gcc and 
the soft regular constraint, we will introduce new violation measures, that are 
likely to be more effective in practical applications. 



3 Soft Global Cardinality Constraint 



A global cardinality constraint (gcc) on a set of variables specifies the minimum 
and maximum number of times each value in the union of their domains should 
be assigned to these variables. Régin developed a domain consistency algorithm 
for the gcc, making use of network flows [15]. A variant of the gcc is the 
cost-gcc, which can be seen as a weighted version of the gcc [16,17]. For the cost-gcc 
a weight is assigned to each variable-value assignment and the goal is to satisfy 
the gcc with minimum total cost. 

Throughout this section, we will use the following notation (unless specified 
otherwise). Let $X$ denote a set of variables $\{x_1, \ldots, x_n\}$ with respective finite 
domains $D_1, \ldots, D_n$. We define $D_X = \bigcup_{i \in \{1,\ldots,n\}} D_i$ and we assume a fixed but 
arbitrary ordering on $D_X$. For $d \in D_X$, let $l_d, u_d \in \mathbb{N}$, with $l_d \le u_d$. Finally, let 
$z$ be a variable with finite domain $D_z$, representing the cost of violation of the 
gcc. 

Definition 4 (Global cardinality constraint). 

$\text{gcc}(X, l, u) = \{(d_1, \ldots, d_n) \mid d_i \in D_i,\ l_d \le |\{d_i \mid d_i = d\}| \le u_d \ \forall d \in D_X\}.$ 

We first give a generic definition for a soft version of the gcc. 

Definition 5 (Soft global cardinality constraint). 

$\text{soft\_gcc}[\star](X, l, u, z) = \{(d_1, \ldots, d_n, \tilde{d}) \mid d_i \in D_i,\ \tilde{d} \in D_z,\ \text{violation}_{\text{soft\_gcc}[\star]}(d_1, \ldots, d_n) \le \tilde{d}\},$ 

where $\star$ defines a violation measure for the gcc. 

In order to define measures of violation for the gcc, it is convenient to introduce 
the following functions. 

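Definition 6 (Overflow, underflow). Given $\text{gcc}(X, l, u)$, define for all $d \in D_X$ 

$$\text{overflow}(X, d) = \begin{cases} |\{x_i \mid x_i = d\}| - u_d & \text{if } |\{x_i \mid x_i = d\}| \ge u_d, \\ 0 & \text{otherwise,} \end{cases}$$ 

$$\text{underflow}(X, d) = \begin{cases} l_d - |\{x_i \mid x_i = d\}| & \text{if } |\{x_i \mid x_i = d\}| \le l_d, \\ 0 & \text{otherwise.} \end{cases}$$ 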



Let $\text{violation}_{\text{soft\_gcc[var]}}$ denote the variable-based cost of violation (see 
Definition 3) of the gcc. The next lemma expresses $\text{violation}_{\text{soft\_gcc[var]}}$ in terms of the 
above functions. 



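Lemma 1. Given $\text{gcc}(X, l, u)$, 

$$\text{violation}_{\text{soft\_gcc[var]}}(X) = \max\Big( \sum_{d \in D_X} \text{overflow}(X, d),\ \sum_{d \in D_X} \text{underflow}(X, d) \Big)$$ 

provided that 

$$\sum_{d \in D_X} l_d \le |X| \le \sum_{d \in D_X} u_d. \qquad (1)$$ 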



Proof. The variable-based cost of violation corresponds to the minimal number 
of re-assignments of variables until both $\sum_{d \in D_X} \text{overflow}(X, d) = 0$ and 
$\sum_{d \in D_X} \text{underflow}(X, d) = 0$. 

Assume $\sum_{d \in D_X} \text{overflow}(X, d) \ge \sum_{d \in D_X} \text{underflow}(X, d)$. Variables assigned 
to values $d' \in D_X$ with $\text{overflow}(X, d') > 0$ can be re-assigned to values $d'' \in D_X$ 
with $\text{underflow}(X, d'') > 0$, until $\sum_{d \in D_X} \text{underflow}(X, d) = 0$. In order to 
achieve $\sum_{d \in D_X} \text{overflow}(X, d) = 0$, we still need to re-assign the other variables 
assigned to values $d' \in D_X$ with $\text{overflow}(X, d') > 0$. Hence, in total we need to 
re-assign exactly $\sum_{d \in D_X} \text{overflow}(X, d)$ variables. 

Similarly when we assume $\sum_{d \in D_X} \text{overflow}(X, d) \le \sum_{d \in D_X} \text{underflow}(X, d)$. 

If (1) does not hold, there is no variable assignment that satisfies the gcc. □ 

Without assumption (1) the variable-based violation measure for the gcc cannot 
be applied. Therefore, we introduce the following value-based violation measure, 
which can also be applied when assumption (1) does not hold. 

Definition 7 (Value-based cost). For $\text{gcc}(X, l, u)$ the value-based cost of 
violation is 

$$\sum_{d \in D_X} \big( \text{overflow}(X, d) + \text{underflow}(X, d) \big).$$ 

We denote the value-based violation measure for the gcc by $\text{violation}_{\text{soft\_gcc[val]}}$. 
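
The two violation measures can be evaluated directly from Definitions 6 and 7 and Lemma 1. The following is a minimal Python sketch, not part of the original paper; the function names and the dictionary encoding of the bounds are assumptions made for illustration.

```python
from collections import Counter

def overflow(assignment, d, upper):
    """Occurrences of value d above its upper bound u_d (0 if within the bound)."""
    return max(0, Counter(assignment)[d] - upper[d])

def underflow(assignment, d, lower):
    """Occurrences of value d missing to reach its lower bound l_d (0 if reached)."""
    return max(0, lower[d] - Counter(assignment)[d])

def value_based_cost(assignment, lower, upper):
    """Definition 7: sum of overflow and underflow over all values."""
    return sum(overflow(assignment, d, upper) + underflow(assignment, d, lower)
               for d in lower)

def variable_based_cost(assignment, lower, upper):
    """Lemma 1: max of total overflow and total underflow.

    Returns None when assumption (1) fails, since no assignment of
    len(assignment) variables can then satisfy the gcc.
    """
    n = len(assignment)
    if not (sum(lower.values()) <= n <= sum(upper.values())):
        return None
    total_over = sum(overflow(assignment, d, upper) for d in lower)
    total_under = sum(underflow(assignment, d, lower) for d in lower)
    return max(total_over, total_under)

# Illustrative data (an assumption): four variables all set to value 1.
lower = {1: 1, 2: 3}
upper = {1: 2, 2: 5}
assignment = (1, 1, 1, 1)
var_cost = variable_based_cost(assignment, lower, upper)
val_cost = value_based_cost(assignment, lower, upper)
```

Here value 1 overflows by 2 and value 2 underflows by 3, so re-assigning three variables to value 2 repairs the constraint, while the value-based measure adds both deviations.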
3.1 Graph Representation 

First, we introduce the concept of a flow in a directed graph, following 
Schrijver [20, pp. 148-150]. 

A directed graph is a pair $\mathcal{G} = (V, A)$ where $V$ is a finite set of vertices and 
$A$ is a family¹ of ordered pairs from $V$, called arcs. For $v \in V$, let $\delta^{\text{in}}(v)$ and 
$\delta^{\text{out}}(v)$ denote the family of arcs entering and leaving $v$ respectively. 

A (directed) walk in $\mathcal{G}$ is a sequence $P = v_0, a_1, v_1, \ldots, a_k, v_k$ where $k \ge 0$, 
$v_0, v_1, \ldots, v_k \in V$, $a_1, \ldots, a_k \in A$ and $a_i = (v_{i-1}, v_i)$ for $i = 1, \ldots, k$. If there 
is no confusion, $P$ may be denoted as $P = v_0, v_1, \ldots, v_k$. A (directed) walk is 
called a (directed) path if $v_0, \ldots, v_k$ are distinct. A closed (directed) walk, i.e. 
$v_0 = v_k$, is called a (directed) circuit if $v_1, \ldots, v_k$ are distinct. 

Let $s, t \in V$. We apply a capacity function $c : A \to \mathbb{R}_+$, a demand function 
$d : A \to \mathbb{R}_+$ and a cost function $w : A \to \mathbb{R}_+$ on the arcs. A function $f : A \to \mathbb{R}$ 
is called a feasible flow from $s$ to $t$, or an $s$-$t$ flow, if 

$$d(a) \le f(a) \le c(a) \quad \text{for each } a \in A, \qquad (2)$$ 
$$f(\delta^{\text{out}}(v)) = f(\delta^{\text{in}}(v)) \quad \text{for each } v \in V \setminus \{s, t\}, \qquad (3)$$ 

where $f(S) = \sum_{a \in S} f(a)$ for all $S \subseteq A$. Property (3) ensures flow conservation, 
i.e. for a vertex $v \ne s, t$, the amount of flow entering $v$ is equal to the amount of 
flow leaving $v$. The value of an $s$-$t$ flow $f$ is defined as 

$$\text{value}(f) = f(\delta^{\text{out}}(s)) - f(\delta^{\text{in}}(s)).$$ 

¹ A family is a set in which elements may occur more than once. 



In other words, the value of a flow is the net amount of flow leaving $s$, which can 
be shown to be equal to the net amount of flow entering $t$. The cost of a flow $f$ 
is defined as 

$$\text{cost}(f) = \sum_{a \in A} w(a) f(a).$$ 

A minimum-cost flow is a feasible $s$-$t$ flow of minimum cost. The minimum-cost 
flow problem is the problem of finding such a minimum-cost flow. 
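
As an illustration of properties (2) and (3) and of the value and cost of a flow, here is a small Python sketch. It is not from the paper: the dictionary encoding of arcs as `(demand, capacity, cost)` triples and all function names are assumptions, and a dictionary keyed by vertex pairs cannot represent parallel arcs, so it covers simple graphs only.

```python
def is_feasible_flow(arcs, flow, s, t):
    """Check properties (2) and (3) for a flow on a simple directed graph.

    arcs: dict mapping (u, v) -> (demand, capacity, cost)
    flow: dict mapping (u, v) -> amount of flow (missing arcs carry 0)
    """
    # Property (2): demand <= flow <= capacity on every arc.
    for a, (demand, capacity, _) in arcs.items():
        if not demand <= flow.get(a, 0) <= capacity:
            return False
    # Property (3): conservation at every vertex other than s and t.
    vertices = {v for a in arcs for v in a}
    for v in vertices - {s, t}:
        inflow = sum(flow.get((u, w), 0) for (u, w) in arcs if w == v)
        outflow = sum(flow.get((u, w), 0) for (u, w) in arcs if u == v)
        if inflow != outflow:
            return False
    return True

def flow_value(arcs, flow, s):
    """Net amount of flow leaving s."""
    out_s = sum(flow.get((u, w), 0) for (u, w) in arcs if u == s)
    in_s = sum(flow.get((u, w), 0) for (u, w) in arcs if w == s)
    return out_s - in_s

def flow_cost(arcs, flow):
    """Sum of w(a) * f(a) over all arcs."""
    return sum(cost * flow.get(a, 0) for a, (_, _, cost) in arcs.items())

# Toy network (hypothetical data): the arc s->t has demand 1.
arcs = {("s", "a"): (0, 2, 1), ("a", "t"): (0, 2, 3), ("s", "t"): (1, 1, 2)}
flow = {("s", "a"): 1, ("a", "t"): 1, ("s", "t"): 1}
feasible = is_feasible_flow(arcs, flow, "s", "t")
value = flow_value(arcs, flow, "s")
cost = flow_cost(arcs, flow)
```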

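Theorem 1 ([15]). A solution to $\text{gcc}(X, l, u)$ corresponds to a feasible $s$-$t$ 
flow of value $n$ in the graph $\mathcal{G} = (V, A)$ with vertex set 

$$V = X \cup D_X \cup \{s, t\}$$ 

and edge set 

$$A = A_{s \to X} \cup A_{X \to D_X} \cup A_{D_X \to t},$$ 

where 

$$A_{s \to X} = \{(s, x_i) \mid i \in \{1, \ldots, n\}\},$$ 
$$A_{X \to D_X} = \{(x_i, d) \mid d \in D_i,\ i \in \{1, \ldots, n\}\},$$ 
$$A_{D_X \to t} = \{(d, t) \mid d \in D_X\},$$ 

with demand function 

$$d(a) = \begin{cases} 1 & \text{if } a \in A_{s \to X}, \\ 0 & \text{if } a \in A_{X \to D_X}, \\ l_d & \text{if } a = (d, t) \in A_{D_X \to t}, \end{cases}$$ 

and capacity function 

$$c(a) = \begin{cases} 1 & \text{if } a \in A_{s \to X}, \\ 1 & \text{if } a \in A_{X \to D_X}, \\ u_d & \text{if } a = (d, t) \in A_{D_X \to t}. \end{cases}$$ 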



Example 1. Consider the CSP 

$x_1 \in \{1,2\}$, $x_2 \in \{1\}$, $x_3 \in \{1,2\}$, $x_4 \in \{1\}$, 
$\text{gcc}(X, l, u)$ 

where $X = \{x_1, \ldots, x_4\}$, $l_1 = 1$, $l_2 = 3$, $u_1 = 2$ and $u_2 = 5$. In Figure 1a the 
corresponding graph $\mathcal{G}$ for the gcc by applying the above procedure is presented. 
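
The construction of Theorem 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the encoding of arcs as `(demand, capacity)` pairs, the `("val", d)` tagging of value vertices and the function names are assumptions, and the bounds $l_1 = 1$, $l_2 = 3$, $u_1 = 2$, $u_2 = 5$ are the ones read from the examples of this section.

```python
def gcc_network(domains, lower, upper):
    """Arcs of the graph of Theorem 1, as a dict (u, v) -> (demand, capacity)."""
    arcs = {}
    for x, dom in domains.items():
        arcs[("s", x)] = (1, 1)               # every variable gets one value
        for d in dom:
            arcs[(x, ("val", d))] = (0, 1)    # x may take value d
    for d in lower:
        arcs[(("val", d), "t")] = (lower[d], upper[d])  # cardinality bounds
    return arcs

def assignment_to_flow(assignment):
    """Route one unit of flow s -> x -> d -> t for every pair x = d."""
    flow = {}
    for x, d in assignment.items():
        flow[("s", x)] = 1
        flow[(x, ("val", d))] = 1
        key = (("val", d), "t")
        flow[key] = flow.get(key, 0) + 1
    return flow

def respects_bounds(arcs, flow):
    """Flow uses only existing arcs and satisfies demand <= f(a) <= capacity.

    Conservation holds by construction of assignment_to_flow.
    """
    return (all(a in arcs for a in flow)
            and all(dem <= flow.get(a, 0) <= cap for a, (dem, cap) in arcs.items()))

domains = {"x1": {1, 2}, "x2": {1}, "x3": {1, 2}, "x4": {1}}
assignment = {"x1": 2, "x2": 1, "x3": 2, "x4": 1}
flow = assignment_to_flow(assignment)

# With l_2 = 3 value 2 can be used at most twice, so the demand on (2, t) fails.
strict = gcc_network(domains, {1: 1, 2: 3}, {1: 2, 2: 5})
# Hypothetical relaxation l_2 = 2 for comparison.
relaxed = gcc_network(domains, {1: 1, 2: 2}, {1: 2, 2: 5})
```

Under these bounds only $x_1$ and $x_3$ can take value 2, so no assignment reaches $l_2 = 3$; this is the kind of over-constrained situation the soft variants below are meant to handle.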



3.2 Variable-Based Violation 

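For the variable-based violation measure, we adapt the graph $\mathcal{G}$ in the following 
way. We add the arc set $\tilde{A}_{X \to D_X} = \{(x_i, d) \mid d \notin D_i,\ i \in \{1, \ldots, n\}\}$, with 
demand $d(a) = 0$ and capacity $c(a) = 1$ for all arcs $a \in \tilde{A}_{X \to D_X}$. Further, we apply 
a cost function $w : A \to \mathbb{R}$, where 

$$w(a) = \begin{cases} 1 & \text{if } a \in \tilde{A}_{X \to D_X}, \\ 0 & \text{otherwise.} \end{cases}$$ 

Let the resulting graph be denoted by $\mathcal{G}_{\text{var}}$. 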



[Figure 1, three panels: a. original gcc; b. soft_gcc[var]; c. soft_gcc[val]] 

Fig. 1. Graph representation for the gcc, the variable-based soft_gcc and the 
value-based soft_gcc. Demand and capacity are indicated between parentheses 
for each arc. Dashed arcs indicate the inserted weighted arcs. 

Example 2. Consider the CSP 

$x_1 \in \{1,2\}$, $x_2 \in \{1\}$, $x_3 \in \{1,2\}$, $x_4 \in \{1\}$, $z \in \{0, 1, \ldots, 4\}$, 
soft_gcc[var]$(X, l, u, z)$, 
minimize $z$, 

where $X = \{x_1, \ldots, x_4\}$, $l_1 = 1$, $l_2 = 3$, $u_1 = 2$ and $u_2 = 5$. In Figure 1b the 
graph $\mathcal{G}_{\text{var}}$ for the soft_gcc[var] is presented. 

Theorem 2. A minimum-cost flow in the graph $\mathcal{G}_{\text{var}}$ corresponds to a solution 
to the soft_gcc[var], minimizing the variable-based violation. 

Proof. An assignment $x_i = d$ corresponds to the arc $a = (x_i, d)$ with $f(a) = 1$. 
By construction, all variables need to be assigned to a value and the cost 
function exactly measures the variable-based cost of violation. □ 

The graph $\mathcal{G}_{\text{var}}$ corresponds to a particular instance of the cost-gcc [16,17]. 
Hence, we can apply the filtering procedures developed for that constraint 
directly to the soft_gcc[var]. The soft_gcc[var] also inherits from the cost-gcc 
the time complexity of achieving domain consistency, being $O(n(m + n \log n))$ 
where $m = \sum_{i=1}^{n} |D_i|$ and $n = |X|$. 

Note that 0] also consider the variable-based cost measure for a different 
version of the soft gcc. Their version considers the parameters $l$ and $u$ to be 
variables too. Hence, the variable-based cost evaluation becomes a rather poor 
measure, as we trivially can change $l$ and $u$ to satisfy the gcc. They fix this by 
restricting the set of variables to consider to be the set $X$, which corresponds to 
our situation. However, they do not provide a filtering algorithm for that case. 

3.3 Value-Based Violation 

For the value-based violation measure, we adapt the graph $\mathcal{G}$ in the following way. 
We add arc sets $A_{\text{underflow}} = \{(s, d) \mid d \in D_X\}$ and $A_{\text{overflow}} = \{(d, t) \mid d \in D_X\}$, 
with demand $d(a) = 0$ for all $a \in A_{\text{underflow}} \cup A_{\text{overflow}}$ and capacity 

$$c(a) = \begin{cases} l_d & \text{if } a = (s, d) \in A_{\text{underflow}}, \\ \infty & \text{if } a \in A_{\text{overflow}}. \end{cases}$$ 

Further, we again apply a cost function $w : A \to \mathbb{R}$, where 

$$w(a) = \begin{cases} 1 & \text{if } a \in A_{\text{underflow}} \cup A_{\text{overflow}}, \\ 0 & \text{otherwise.} \end{cases}$$ 

Let the resulting graph be denoted by $\mathcal{G}_{\text{val}}$. 



Example 3. Consider the CSP 

$x_1 \in \{1,2\}$, $x_2 \in \{1\}$, $x_3 \in \{1,2\}$, $x_4 \in \{1\}$, $z \in \{0, 1, \ldots, 5\}$, 
soft_gcc[val]$(X, l, u, z)$, 
minimize $z$, 

where $X = \{x_1, \ldots, x_4\}$, $l_1 = 1$, $l_2 = 2$, $u_1 = 3$ and $u_2 = 2$. In Figure 1c the 
graph $\mathcal{G}_{\text{val}}$ for the soft_gcc with respect to value-based cost is presented. 

Theorem 3. A minimum-cost flow in the graph $\mathcal{G}_{\text{val}}$ corresponds to a solution 
to the soft_gcc[val], minimizing the value-based violation. 

Proof. An assignment $x_i = d$ corresponds to the arc $a = (x_i, d)$ with $f(a) = 1$. 
By construction, all variables need to be assigned to a value and the cost 
function exactly measures the value-based cost of violation. □ 

Unfortunately, the graph $\mathcal{G}_{\text{val}}$ does not preserve the structure of the 
cost-gcc because of the arcs $A_{\text{underflow}}$. Therefore we cannot blindly apply the same 
filtering algorithms. However, it is still possible to design an efficient filtering 
algorithm for the value-based soft_gcc (in the same spirit as the filtering 
algorithm for the cost-gcc), based again on flow theory. For this, we need to introduce 
the residual graph $\mathcal{G}^f = (V, A^f)$ of a flow $f$ on $\mathcal{G} = (V, A)$ (with respect to $c$ and 
$d$), where 

$$A^f = \{a \mid a \in A,\ f(a) < c(a)\} \cup \{a^{-1} \mid a \in A,\ f(a) > d(a)\}.$$ 

Here $a^{-1} = (v, u)$ if $a = (u, v)$. We extend $w$ to $A^{-1} = \{a^{-1} \mid a \in A\}$ by defining 
$w(a^{-1}) = -w(a)$ for each $a \in A$. 
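
To make the residual graph concrete, the following Python sketch builds $A^f$ with its extended costs and computes a shortest path in it with the Bellman-Ford method (chosen here because residual arcs may have negative cost; the paper does not prescribe it). The arc encoding, the function names and the toy network are assumptions for illustration only.

```python
def residual_graph(arcs, flow):
    """Residual arcs with costs: dict (u, v) -> cost.

    arcs: dict (u, v) -> (demand, capacity, cost); flow: dict (u, v) -> amount.
    If a forward and a reversed arc coincide, only the cheaper one is kept,
    which is sufficient for shortest-path computations.
    """
    residual = {}
    for (u, v), (demand, capacity, cost) in arcs.items():
        f = flow.get((u, v), 0)
        if f < capacity:   # arc a itself: more flow can be pushed
            residual[(u, v)] = min(cost, residual.get((u, v), cost))
        if f > demand:     # reversed arc a^{-1}: flow can be taken back
            residual[(v, u)] = min(-cost, residual.get((v, u), -cost))
    return residual

def shortest_path_cost(residual, source, target):
    """Bellman-Ford; assumes no negative-cost circuit (true for a min-cost flow)."""
    vertices = {v for a in residual for v in a} | {source, target}
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):
        changed = False
        for (u, v), c in residual.items():
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                changed = True
        if not changed:
            break
    return dist[target]

# Toy network (hypothetical): one variable x, values a and b; using b costs 2.
arcs = {("s", "x"): (1, 1, 0), ("x", "a"): (0, 1, 0), ("x", "b"): (0, 1, 0),
        ("a", "t"): (0, 1, 0), ("b", "t"): (0, 1, 2)}
flow = {("s", "x"): 1, ("x", "a"): 1, ("a", "t"): 1}   # a minimum-cost flow
res = residual_graph(arcs, flow)
extra = shortest_path_cost(res, "b", "x")
```

In this toy network, forcing the arc $(x, b)$ into the flow re-routes one unit along the residual path $b, t, a, x$, whose cost 2 is exactly the increase over the minimum-cost flow.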

Theorem 4. Let $f$ be a minimum-cost flow in $\mathcal{G}_{\text{val}}$. Then soft_gcc[val]$(X, l, u, z)$ 
is domain consistent if and only if 

$$\min D_z \ge \text{cost}(f)$$ 

and 

$$\text{cost}(f) + \text{cost}(SP(d, x_i)) \le \max D_z \quad \forall (x_i, d) \in A_{X \to D_X},$$ 

where $\text{cost}(SP(d, x_i))$ denotes the cost of a shortest path from $d$ to $x_i$ in the 
residual graph $\mathcal{G}^f_{\text{val}}$. 

Proof. From flow theory we know that, given a minimum-cost flow $f$ in 
$\mathcal{G}_{\text{val}}$, if we enforce arc $(x_i, d)$ to be in a minimum-cost flow $\bar{f}$ in $\mathcal{G}_{\text{val}}$, then 
$\text{cost}(\bar{f}) = \text{cost}(f) + \text{cost}(SP(d, x_i))$, where $SP(d, x_i)$ is the shortest $d$-$x_i$ path in $\mathcal{G}^f_{\text{val}}$. 

In order for a value $d \in D_i$ to be consistent, the cost of a minimum-cost flow 
that uses $(x_i, d)$ should be less than or equal to $\max D_z$. By the above fact, we 
only need to compute a shortest path from $d$ to $x_i$ instead of a new 
minimum-cost flow. □ 



A minimum-cost flow $f$ in $\mathcal{G}_{\text{val}}$ can be computed in $O(m(m + n \log n))$ time 
(see PP), where again $m = \sum_{i=1}^{n} |D_i|$ and $n = |X|$. Compared to the complexity 
of the soft_gcc[var], we have a factor $m$ instead of $n$. This is because computing 
the flow for soft_gcc[val] is dependent on the number of arcs $m$ rather than 
on the number of variables $n$. A shortest $d$-$x_i$ path in $\mathcal{G}^f_{\text{val}}$ can be computed in 
$O(m + n \log n)$ time. Hence the soft_gcc with respect to the value-based violation 
measure can be made domain consistent in $O((m - n)(m + n \log n))$ time, 
as we need to check $m - n$ arcs for consistency. 

When $l = 0$ in soft_gcc[val]$(X, l, u, z)$, the arc set $A_{\text{underflow}}$ is empty. In 
that case, $\mathcal{G}_{\text{val}}$ has a particular structure, i.e. the only costs appear on arcs from 
$D_X$ to $t$. As pointed out in 1^ for the soft_alldifferent constraint, constraints 
with this structure can be checked for consistency in $O(nm)$ time, and domain 
consistency can be achieved in $O(m)$ time. The result is obtained by exploiting 
the strongly connected components² in $\mathcal{G}^f_{\text{val}}$ restricted to vertex sets $X$ and $D_X$. 



4 Soft Regular Constraint 

A regular constraint [12] on a fixed-length sequence of finite-domain variables 
requires that the corresponding sequence of values taken by these variables 
belong to a given regular language. A deterministic finite automaton (DFA) may 
be described by a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$ where $Q$ is a finite set of states, 
$\Sigma$ is an alphabet, $\delta : Q \times \Sigma \to Q$ is a partial transition function, $q_0 \in Q$ is 
the initial state, and $F \subseteq Q$ is the set of final (or accepting) states. A finite 
sequence of symbols from an alphabet is called a string. Strings processed by 
$M$ and ending in an accepting state from $F$ are said to belong to the language 
defined by $M$, denoted $L(M)$. The languages recognized by DFAs are precisely 
regular languages. 

Given a sequence $x = (x_1, x_2, \ldots, x_n)$ of finite-domain variables with respective 
domains $D_1, D_2, \ldots, D_n \subseteq \Sigma$, there is a natural interpretation of the set 
of possible instantiations of $x$, $D_1 \times D_2 \times \cdots \times D_n$, as a subset of all strings of 
length $n$ over $\Sigma$, $\Sigma^n$. We are now ready to state the constraint. 

Definition 8 (Regular language membership constraint). Let 
$M = (Q, \Sigma, \delta, q_0, F)$ denote a deterministic finite automaton and $x$ a sequence of 
finite-domain variables $(x_1, x_2, \ldots, x_n)$ with respective domains $D_1, D_2, \ldots, 
D_n \subseteq \Sigma$. Under a regular language membership constraint 
regular$(x, M)$, any sequence of values taken by the variables of $x$ corresponds 
to a string in $L(M)$. 

² A strongly connected component in a directed graph $\mathcal{G} = (V, A)$ is a subset of vertices 
$S \subseteq V$ such that there exists a directed $u$-$v$ path in $\mathcal{G}$ for all $u, v \in S$. 

In [12], a domain consistency algorithm for the regular constraint processed 
the sequence $x$ with the automaton $M$, building a layered directed multi-graph 
$\mathcal{G} = (N^1, N^2, \ldots, N^{n+1}, A)$ where each layer $N^i = \{q^i_0, q^i_1, \ldots, q^i_{|Q|-1}\}$ contains 
a different node for each state of $M$ and arcs only appear between consecutive 
layers. Each arc corresponds to a consistent variable-value pair: there is an arc 
from $q^i_k$ to $q^{i+1}_\ell$ if and only if there exists some $v_j \in D_i$ such that $\delta(q_k, v_j) = q_\ell$ 
and the arc belongs to a path from $q^1_0$ in the first layer to a member of $F$ in 
the last layer. The existence of such an arc, labeled $v_j$, constitutes a support for 
variable $x_i$ taking value $v_j$. 

For example, consider a sequence $x$ of five variables with $D_1 = \{a, b, c, o\}$, 
$D_2 = \{b, o\}$, $D_3 = \{a, c, o\}$, $D_4 = \{a, b, o\}$, and $D_5 = \{a\}$. Figure 2 gives 
an automaton $M$ (with its initial state labeled 1) and the resulting graph for 
constraint regular$(x, M)$. As a result, value $b$ is removed from $D_2$ and $D_4$. 
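
The layered-graph filtering described above can be sketched as a forward pass followed by a backward pass over the layers. The Python code below is a simplified illustration, not the incremental algorithm of [12]; the function name and the small DFA (strings over {a, b} with no two consecutive b's) are hypothetical and unrelated to the automaton of Figure 2.

```python
def filter_regular(domains, delta, q0, finals):
    """Keep only the values that lie on a path from q0 to a final state.

    domains: list of sets of symbols, one per variable
    delta:   dict (state, symbol) -> state (partial transition function)
    """
    n = len(domains)
    # forward[i]: states reachable from q0 after reading i symbols
    forward = [{q0}]
    for dom in domains:
        forward.append({delta[(q, v)] for q in forward[-1] for v in dom
                        if (q, v) in delta})
    # backward[i]: reachable states of layer i that can still reach a final state
    backward = [set() for _ in range(n + 1)]
    backward[n] = forward[n] & set(finals)
    for i in range(n - 1, -1, -1):
        backward[i] = {q for q in forward[i] for v in domains[i]
                       if (q, v) in delta and delta[(q, v)] in backward[i + 1]}
    # a value is supported if some arc labeled with it joins two kept states
    return [{v for v in dom for q in backward[i]
             if (q, v) in delta and delta[(q, v)] in backward[i + 1]}
            for i, dom in enumerate(domains)]

# Hypothetical DFA: state 1 means "the last symbol read was b".
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0}
domains = [{"a", "b"}, {"b"}, {"a", "b"}]
filtered = filter_regular(domains, delta, q0=0, finals={0, 1})
```

Since the second variable is fixed to b, value b loses its support for both of its neighbours.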




Fig. 2. A DFA (left) and its layered directed graph $\mathcal{G}$ (right). 



4.1 Cost Definition 

We first give a generic definition for a soft version of the regular constraint. 

Definition 9 (Soft regular language membership constraint). Let 
$M = (Q, \Sigma, \delta, q_0, F)$ denote a deterministic finite automaton and $x$ a sequence of 
finite-domain variables $(x_1, x_2, \ldots, x_n)$ with respective domains $D_1, D_2, \ldots, 
D_n \subseteq \Sigma$. Let $z$ be a finite-domain variable of domain $D_z \subseteq \mathbb{N}$ representing the 
cost of a violation and let $d : \Sigma^* \times \Sigma^* \to \mathbb{N}$ be some distance function over strings. 
Under a soft regular language membership constraint soft_regular[$d$]$(x, M, z)$, 
for any sequence of values $\sigma$ taken by the variables of $x$ we have 
$\min_{\sigma' \in L(M)} \{d(\sigma, \sigma')\} = z$. 



Our first instantiation of the distance function yields the variable-based cost: 

Definition 10 (Hamming distance). The number of positions in which two 
strings of same length differ is called their Hamming distance. 

Intuitively, such a distance represents the number of symbols we need to change 
to go from one string to the other, or equivalently the number of variables whose 
value must change. Using the Hamming distance for d in the previous definition, 
z becomes the variable-based cost. 

Another distance function that is often used with strings is the following: 

Definition 11 (Edit distance). The smallest number of insertions, deletions, 
and substitutions required to change one string into another is called the edit 
distance. 

It captures the fact that two strings that are identical except for one extra or 
missing symbol should be considered close to one another. For example, the edit 
distance between strings "bcdea" and "abcde" is two: insert an 'a' at the front of 
the first string and delete the 'a' from its end. The Hamming distance between 
the same strings is five: every symbol must be changed. Edit distance is probably 
a better way to measure violations of a regular constraint. We provide a more 
natural example in the area of rostering. Given a string, we call stretch a maximal 
substring of identical values. We often need to impose restrictions on the length 
of stretches of work shifts, and these can be expressed with a regular constraint. 
Suppose stretches of a's and b's must each be of length 2 and consider the string 
"abbaabbaab": its Hamming distance to a string belonging to the corresponding 
regular language is 5 since changing either the first a to a b or b to an a has a 
domino effect on the following stretches; its edit distance is just 2 since we can 
insert an a at the beginning to make a legal stretch of a's and remove the b at 
the end. In this case, the edit distance reflects the number of illegal stretches 
whereas the Hamming distance is proportional to the length of the string. 
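
Both distances are easy to compute for a pair of strings, which allows the figures quoted above to be checked. The Python sketch below is illustrative only; the function names are assumptions, and the edit distance uses the standard dynamic program with unit costs for insertion, deletion and substitution.

```python
def hamming(s, t):
    """Number of positions at which two strings of equal length differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(s, t))

def edit_distance(s, t):
    """Smallest number of insertions, deletions and substitutions turning s into t."""
    prev = list(range(len(t) + 1))          # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        cur = [i]
        for j, b in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute (free if equal)
        prev = cur
    return prev[-1]

h1, e1 = hamming("bcdea", "abcde"), edit_distance("bcdea", "abcde")
# Stretch example: "aabbaabbaa" is one legal string with stretches of length 2.
h2, e2 = hamming("abbaabbaab", "aabbaabbaa"), edit_distance("abbaabbaab", "aabbaabbaa")
```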

4.2 Cost Evaluation and Cost-Based Filtering 

For both cost measures, we proceed by modifying the layered directed graph 
$\mathcal{G}$ built for the "hard" version of regular into graph $\mathcal{G}_{\text{var}}$. Before, we added 
an arc from $q^i_k$ to $q^{i+1}_\ell$ if $\delta(q_k, v_j) = q_\ell$ for some $v_j \in D_i$; now we relax it 
slightly to any $v_j \in \Sigma$. This only makes a difference if the domains of the 
variables are not initially full. Arcs are never removed in $\mathcal{G}_{\text{var}}$ but their labels 
are updated instead. The label of an arc $(q^i_k, q^{i+1}_\ell)$ is generalized to the invariant 
$V_{ik\ell} = \{v_j \in D_i \mid \delta(q_k, v_j) = q_\ell\}$; as values are removed from the domain of 
variable $x_i$, they are also removed from the corresponding $V_{ik\ell}$'s. The cost of 
using an arc $(q^i_k, q^{i+1}_\ell)$ for variable-value pair $(x_i, v_j)$ will be zero if $v_j$ belongs to 
$V_{ik\ell}$ and some positive integer cost otherwise. This cost represents the penalty for 
an individual violation. In the remainder of the section we will consider unit costs 
but the framework also makes it possible to use varying costs, e.g. to distinguish 
between insertions and substitutions when using the edit distance. The graph on 
the left of Figure 3 is a shorthand version of $\mathcal{G}_{\text{var}}$ for the automaton of Figure 2. 
Since all values in $\Sigma$ are considered, the same arcs appear between consecutive 
layers. What changes from one layer to the other are the $V_{ik\ell}$ labels. 

[Figure 3, two panels, each labeled $i = 2, \ldots, 5$] 

Fig. 3. Shorthand versions of $\mathcal{G}_{\text{var}}$ (left) and $\mathcal{G}_{\text{edit}}$ (right) for the DFA of Figure 2. 

Taking into account substitutions, common to both Hamming and edit 
distances, is immediate from the previous modification. It is not difficult to see that 
the introduction of costs transforms a supporting path in the domain consistency 
algorithm for regular into a zero-cost path in the modified graph. The cost of 
a shortest path from $q^1_0$ in the first layer to a member of $F$ in the last layer 
corresponds to the smallest number of variables forced to take a value outside 
of their domain. 

Theorem 5. A minimum-cost path from $q^1_0 \in N^1$ to a node $q^{n+1}_f \in N^{n+1}$ with 
$q_f \in F$ in $\mathcal{G}_{\text{var}}$ corresponds to a solution to soft_regular[var] minimizing the 
variable-based cost (Hamming distance). 

Just as the existence of a path through a given arc representing a variable-value 
pair constituted a support for that pair in the filtering algorithm for regular, 
the existence of a path whose cost doesn't exceed $\max D_z$ constitutes a support 
for that variable-value pair in a cost-based filtering algorithm for soft_regular. 

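Theorem 6. soft_regular[var]$(x, M, z)$ is domain consistent on $x$ and bound 
consistent on $z$ if and only if 

$$\min \{\text{cost}(SP(q^1_0, q^{n+1}_f))\} \le \min D_z$$ 

and 

$$\min \{\text{cost}(SP(q^1_0, q^i_k)) + \text{cost}(SP(q^{i+1}_\ell, q^{n+1}_f))\} \le \max D_z, \quad \forall x_i \in x,\ v_j \in D_i,$$ 

where $\delta(q_k, v_j) = q_\ell$, $q_f \in F$, and $\text{cost}(SP(u, v))$ denotes the cost of a shortest path from 
$u$ to $v$ in $\mathcal{G}_{\text{var}}$. 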

Computing shortest paths from the initial state in the first layer to every 
other node and from every node to a final state in the last layer can be done in 
$O(n |\delta|)$ time³ through topological sorts because of the special structure of the 
graph. That computation can also be made incremental in the same way as in 
[12]. Recently, that same result was independently obtained in 0. We however 
go further by considering edit distance, for which insertions and deletions are 
allowed as well. 
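
The two shortest-path passes and the test of Theorem 6 can be sketched as follows. This Python code is a non-incremental illustration with unit costs, not the authors' implementation; the function name and the example DFA (no two consecutive b's) are hypothetical. A transition read at position $i$ costs 0 when its symbol is in $D_i$ and 1 otherwise.

```python
INF = float("inf")

def soft_regular_var(domains, delta, q0, finals, max_z):
    """Return (minimum violation, filtered domains) following Theorem 6.

    delta: dict (state, symbol) -> state; every symbol of the alphabet may be
    used at every position, at cost 1 when it is outside the current domain.
    """
    n = len(domains)
    states = {q0} | set(finals) | {q for q, _ in delta} | set(delta.values())
    cost = lambda i, v: 0 if v in domains[i] else 1
    # fwd[i][q]: cheapest way to reach state q after i symbols
    fwd = [{q: INF for q in states} for _ in range(n + 1)]
    fwd[0][q0] = 0
    for i in range(n):
        for (q, v), r in delta.items():
            fwd[i + 1][r] = min(fwd[i + 1][r], fwd[i][q] + cost(i, v))
    # bwd[i][q]: cheapest way to reach a final state from q at layer i
    bwd = [{q: INF for q in states} for _ in range(n + 1)]
    for q in finals:
        bwd[n][q] = 0
    for i in range(n - 1, -1, -1):
        for (q, v), r in delta.items():
            bwd[i][q] = min(bwd[i][q], cost(i, v) + bwd[i + 1][r])
    min_violation = min(fwd[n][q] for q in finals)
    filtered = []
    for i, dom in enumerate(domains):
        # v stays if some arc labeled v lies on a path of cost <= max_z
        filtered.append({v for v in dom
                         if min((fwd[i][q] + bwd[i + 1][r]
                                 for (q, s), r in delta.items() if s == v),
                                default=INF) <= max_z})
    return min_violation, filtered

delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0}
domains = [{"a", "b"}, {"b"}, {"a", "b"}, {"b"}, {"b"}]
min_violation, filtered = soft_regular_var(domains, delta, 0, {0, 1}, max_z=1)
```

The last two variables are both fixed to b, so at least one substitution is needed; with $\max D_z = 1$ the values that would require a second substitution lose their support.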

For deletions we need to allow "wasting" a value without changing the current 
state. To this effect, we add to $\mathcal{G}_{\text{var}}$ an arc $(q^i_k, q^{i+1}_k)$ $\forall\, 1 \le i \le n$, $q_k \in Q$, with 
$V_{ikk} = \emptyset$, if it isn't already present in the graph. To allow insertions, inspired by 
$\varepsilon$-transitions in finite automata, we introduce some special arcs between nodes in the same 
layer: if $\exists v \in \Sigma$ such that $\delta(q_k, v) = q_\ell$ then we further add an arc $(q^i_k, q^i_\ell)$ $\forall\, 1 \le 
i \le n + 1$ with fixed positive cost. Figure 3 provides an example of the resulting 
graph (on the right). Unfortunately, those special arcs modify the structure of 
the graph since cycles (of strictly positive cost) are introduced. Consequently 
shortest paths can no longer be computed through topological sorts. An efficient 
implementation of Dijkstra's algorithm increases the time complexity to $O(n |\delta| + 
n |Q| \log(n |Q|))$. Regardless of this increase in computational cost, Theorems 5 
and 6 can be generalized to hold for soft_regular[edit] as well. 

5 Aggregating Soft Constraints 

The preceding sections have introduced filtering algorithms based on different 
violation measures for two soft global constraints. If these filtering techniques 
are to be effective, especially in the presence of soft constraints of a different 
nature, they must be able to cooperate and communicate. Even though there 
are many avenues for combining soft constraints, the objective almost always 
remains to minimize constraint violations. We propose here a small extension 
to the approach of [13], where meta-constraints on the cost variables of soft 
constraints are introduced. We illustrate this approach with the newly introduced 
sof t_gcc. 

Definition 12 (Soft global cardinality aggregator). Let $S$ be a set of soft 
constraints and $z_i \in D_{z_i}$ the variable indicating the violation cost of $S_i \in S$. The 
soft global cardinality aggregator (sgca) is defined as soft_gcc[$\star$]$(Z, l, u, z_{\text{agg}})$ 
where $Z = \{z_1, \ldots, z_{|S|}\}$, $[l_i, u_i]$ is the interval defining the allowed number of 
occurrences of each value in the domain of $z_i$ and $z_{\text{agg}} \in D_{z_{\text{agg}}} \subseteq \mathbb{N}$ the cost 
variable based on the violation measure $\star$. 

When all constraints are either satisfied or violated ($Z \in \{0, 1\}^{|S|}$) the 
Max-CSP approach can be easily obtained by setting $l_1 = 0$, $u_1 = 0$, $\text{violation}(Z) = 
\sum_{d \in D_Z} \text{overflow}(Z, d)$ and reading the number of violations in $z_{\text{agg}}$. The sgca 
could also be used as in [13] to enforce homogeneity (in a soft manner) or to 
define other violation measures like restricting the number of highly violated 
constraints. For instance, we could wish to impose that no more than a certain 
number of constraints are highly violated, but since we cannot guarantee that 
this is possible, the use of sgca allows us to state this wish without risking 
creating an inconsistent problem. More generally, by defining the values of $l$ and 
$u$ accordingly it is possible to limit (or at least attempt to limit) the number 
of violated constraints by violation cost. Another approach could be to set all $u$ 
to 0 and adjust the violation function so that higher violation costs are more 
penalized. The use of soft meta-constraints, when possible, is also an alternative 
to the introduction of disjunctive constraints since they need not be satisfied for 
the problem to be consistent. 

³ $|\delta|$ refers to the number of transitions in the automaton. 
In the original meta-constraint framework, similar behaviour can be 
established by applying a cost-gcc to $Z$. For instance, we can define for each pair $(z_i, d)$ 
($d \in D_{z_i}$) a cost $d$ which penalizes higher violations more. With the soft_gcc, 
this cost function can be stated as $\text{violation}(Z) = \sum_{d \in D_Z} d \cdot \text{overflow}(Z, d)$. 
However, as for this variant of the soft_gcc we have $l = 0$, the soft_gcc will be 
much more efficient than the cost-gcc, as was discussed at the end of Section 3. 
In fact, the sgca can be checked for consistency in $O(nm)$ time and made domain 
consistent in $O(m)$ time (where $n = |S|$ and $m = \sum_i |D_{z_i}|$) whenever $l = 0$ and 
$\text{violation}(Z) = \sum_{d \in D_Z} F(d) \cdot \text{overflow}(Z, d)$ for any cost function $F$ defined on $D_Z$. 

6 Conclusion 

We have presented soft versions of two global constraints: the global cardinality 
constraint and the regular constraint. Different violation measures have been 
presented and the corresponding filtering algorithms achieving domain consis- 
tency have been introduced. These new techniques are based on the addition 
of "relaxation arcs" in the underlying graph and the use of known graph algo- 
rithms. We also have proposed to extend the Meta-Constraint framework for 
combining constraint violations by using the soft version of gcc. 

Since these two constraints are very useful to solve Personnel Rostering Prob- 
lems the next step is thus the implementation of these algorithms in order to 
model such problems and benchmark these new constraints. 